K-means recovers ICA filters when independent components are sparse

Authors

  • Alon Vinnikov
  • Shai Shalev-Shwartz
Abstract

Unsupervised feature learning is the task of using unlabeled examples to build a representation of objects as vectors. This task has been extensively studied in recent years, mainly in the context of unsupervised pre-training of neural networks. Recently, Coates et al. (2011) conducted extensive experiments comparing the accuracy of a linear classifier trained on features learnt by several unsupervised feature learning methods. Surprisingly, the best performing method was the simplest feature learning approach, which applies the K-means clustering algorithm after whitening the data. The goal of this work is to shed light on the success of K-means with whitening for the task of unsupervised feature learning. Our main result is a close connection between K-means and ICA (Independent Component Analysis). Specifically, we show that K-means and similar clustering algorithms can be used to recover the ICA mixing matrix or its inverse, the ICA filters. It is well known that the independent components found by ICA form useful features for classification (Le et al., 2012; 2011; 2010), hence the connection between K-means and ICA explains the empirical success of K-means as a feature learner. Moreover, our analysis underscores the significance of the whitening operation, as was also observed in the experiments reported in Coates et al. (2011). Finally, our analysis leads to a better initialization of K-means for the task of feature learning.
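The connection described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' algorithm or analysis: it assumes an idealized, extremely sparse source model (exactly one active source per sample, with amplitude bounded away from zero), a hand-picked mixing matrix `A`, and a plain Lloyd-style K-means with a deterministic farthest-first initialization. All function names and parameter values are hypothetical choices made for this example.

```python
import numpy as np

def make_sparse_mixture(n=20000, seed=0):
    """Toy data: exactly one active source per sample (assumed, idealized sparsity)."""
    rng = np.random.default_rng(seed)
    A = np.array([[2.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 0.0, 2.0]])           # hypothetical invertible mixing matrix
    d = A.shape[0]
    active = rng.integers(0, d, size=n)        # index of the single active source
    amp = rng.uniform(1.0, 2.0, size=n) * rng.choice([-1.0, 1.0], size=n)
    S = np.zeros((n, d))
    S[np.arange(n), active] = amp
    return S @ A.T, A                          # observations x = A s, one per row

def whiten(X):
    """Center the data and apply symmetric (ZCA) whitening."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(Xc)
    vals, vecs = np.linalg.eigh(cov)
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T  # whitening matrix
    return Xc @ W.T, W

def kmeans(Y, k, n_iter=20):
    """Lloyd's algorithm with a deterministic farthest-first initialization."""
    centers = [Y[0]]
    dist = np.linalg.norm(Y - centers[0], axis=1)
    for _ in range(k - 1):
        centers.append(Y[np.argmax(dist)])
        dist = np.minimum(dist, np.linalg.norm(Y - centers[-1], axis=1))
    C = np.array(centers)
    for _ in range(n_iter):
        d2 = ((Y[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = Y[labels == j].mean(axis=0)
    return C

def abs_cosine(U, V):
    """Absolute cosine similarity between the rows of U and the rows of V."""
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    return np.abs(U @ V.T)

X, A = make_sparse_mixture()
Y, W = whiten(X)
C = kmeans(Y, k=2 * A.shape[0])                # one centroid per signed source direction
filters = C @ W                                # centroids mapped back to input space
match = abs_cosine(np.linalg.inv(A), filters).max(axis=1)
print(match)                                   # best |cosine| per true ICA filter
```

Under these assumptions each row of `inv(A)` (a true ICA filter) should be matched, up to sign and scale, by at least one row of `filters`, so the entries of `match` should be close to 1. With less extreme sparsity or a random initialization the recovery is only approximate, and this sketch says nothing about the conditions under which the paper's guarantees hold.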

Related resources

Edges are the Independent Components of Natural Scenes

Field (1994) has suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and Barlow (1989) has reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that non-linear 'infoma...

The "Independent Components" of Scenes are Edge Filters

It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsu...

The “independent components” of natural scenes are edge filters

It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised...

Beyond Independent Components

Independent component analysis (ICA) attempts to find a linear decomposition of observed data vectors into components that are statistically independent. It is well known, however, that such a decomposition cannot be exactly found, and in many practical applications, independence is not achieved even approximately. This raises the question on the utility and interpretation of the components given...

Sparse ICA via cluster-wise PCA

In this paper, it is shown that independent component analysis (ICA) of sparse signals (sparse ICA) can be seen as a cluster-wise principal component analysis (PCA). Consequently, Sparse ICA may be done by a combination of a clustering algorithm and PCA. For the clustering part, we use, in this paper, an algorithm inspired from K-means. The final algorithm is easy to implement for any number of...

Publication date: 2014